Deep learning-based scoring of tumour-infiltrating lymphocytes is prognostic in primary melanoma and predictive to PD-1 checkpoint inhibition in melanoma metastases

Summary
Background Recent advances in digital pathology have enabled accurate and standardised enumeration of tumour-infiltrating lymphocytes (TILs). Here, we aim to evaluate TILs as a percentage electronic TIL score (eTILs) and investigate its prognostic and predictive relevance in cutaneous melanoma.
Methods We included stage I to IV cutaneous melanoma patients and used hematoxylin-eosin-stained slides for TIL analysis. We assessed eTILs as a continuous and categorical variable using the published cut-off of 16.6% and applied Cox regression models to evaluate associations of eTILs with relapse-free, distant metastasis-free, and overall survival. We compared eTILs of the primaries with matched metastasis. Moreover, we assessed the predictive relevance of eTILs in therapy-naïve metastases according to the first-line therapy.
Findings We analysed 321 primary cutaneous melanomas and 191 metastatic samples. In simple Cox regression, tumour thickness (p < 0.0001), presence of ulceration (p = 0.0001) and eTILs ≤16.6% (p = 0.0012) were found to be significant unfavourable prognostic factors for RFS. In multiple Cox regression, eTILs ≤16.6% (p = 0.0161) remained significant and downgraded the current staging. Lower eTILs in the primary tissue was associated with unfavourable relapse-free (p = 0.0014) and distant metastasis-free survival (p = 0.0056). In multiple Cox regression adjusted for tumour thickness and ulceration, eTILs as continuous remained significant (p = 0.019). When comparing TILs in primary tissue and corresponding metastasis of the same patient, eTILs in metastases was lower than in primary melanomas (p < 0.0001). In therapy-naïve metastases, an eTILs >12.2% was associated with longer progression-free survival (p = 0.037) and melanoma-specific survival (p = 0.0038) in patients treated with anti-PD-1-based immunotherapy. In multiple Cox regression, lactate dehydrogenase (p < 0.0001) and eTILs ≤12.2% (p = 0.0130) were significantly associated with unfavourable melanoma-specific survival.
Interpretation Assessment of TILs is prognostic in primary melanoma samples, and the eTILs complements staging. In therapy-naïve metastases, eTILs ≤12.2% is predictive of unfavourable survival outcomes in patients receiving anti-PD-1-based therapy.
Funding See a detailed list of funding bodies in the Acknowledgements section at the end of the manuscript.


Introduction
Melanoma is an immunogenic tumour 1 that often exhibits various cell types in its microenvironment, including fibroblasts, endothelial cells, and immune cells. The interaction of tumour cells with the microenvironment plays a critical role in melanoma progression, mediating either tumour immunity or tumour promotion. Of the immune cells, lymphocytes infiltrating the tumour are called tumour-infiltrating lymphocytes (TILs). Yazdi et al. showed that heterogeneous T-cell clones could infiltrate primary melanomas. 2 TILs include CD8+ T cells, CD4+ T cells, B cells, and NK cells, with CD8+ T cells being the most common subtype in melanoma and associated with a better prognosis. In contrast, other immune cells, including M2 macrophages, T-regulatory cells (Treg), and myeloid-derived suppressor cells (MDSC), are immunosuppressive, leading to tumour promotion. 3,4 The distribution pattern of TILs in melanoma is heterogeneous, ranging from stromal to peritumoral and intratumoral. Intratumoral TILs are found within nests of melanoma cells, peritumoral TILs at the invasive margin of the tumour, and stromal TILs in the stromal areas beyond the tumour border. 5,6 TILs have long been studied as a potential biomarker in melanoma, and various methods for assessing the immune infiltrate have been described. These include visual characterisation by a histopathologist, immunohistochemistry, multiplex immunofluorescence with quantification by image analysis tools, molecular methods based on gene expression signatures, determination of T-cell receptor (TCR) clonality, and proteomics. 7 Different scoring and classification systems have been proposed for TIL evaluation and quantification. The most sophisticated system was proposed by Clark et al. in 1989, which characterises TILs as absent, non-brisk, and brisk. 8 TILs are classified as absent when no lymphocytes are found within the tumour or when they are found exclusively in perivascular and fibrotic areas.
Non-brisk infiltrate is defined as focal TILs found only at the tumour margins or the tumour base, whereas brisk TILs infiltrate the entire tumour or base. 9,10 A recent meta-analysis showed that brisk lymphocytes were associated with better disease-specific survival. 11 However, Němejcová et al. compared different scoring systems, and TILs were not found to be an independent prognostic factor when these were assessed histopathologically. 12 The American Joint Committee on Cancer (AJCC) classification of melanoma (8th edition) is established for risk stratification of cutaneous melanoma patients based on overall survival. Tumour thickness, ulceration, and the presence of locoregional and distant metastases are the main criteria for stratifying patients. Early-stage cutaneous melanoma patients without metastases (stage I and II) are categorised into substages according to

Research in context
Evidence before this study There are conflicting results regarding the prognostic significance of TILs in primary melanoma. NN192 is an algorithm developed to standardise the quantification of TILs in primary melanoma. Moreover, no biomarkers are used in clinical practice to predict the outcomes of anti-PD-1-based immunotherapy.

Added value of this study
We validated that TILs quantification as eTILs using the deep learning NN192 algorithm is prognostic in primary melanoma in stages IB-IIC. We showed a decrease in eTILs from primary melanoma to matched metastases, reinforcing the immunoediting hypothesis. We demonstrated that quantifying TILs in therapy-naïve melanoma metastases is predictive of response and survival outcomes in patients treated with anti-PD-1-based immunotherapy.
Implications of all the available evidence
eTILs complement staging for stage I/II melanoma patients to assess relapse-free survival, which could be applied in patients' follow-up and adjuvant therapy decisions. For early-stage BRAF-mutated melanoma, it is important to assess whether patients with low eTILs should receive adjuvant BRAF inhibition rather than immunotherapy. Currently, there are no predictive biomarkers in clinical practice for immunotherapy in stage III/IV melanoma patients. This study provides evidence that electronic quantification of TILs in melanoma metastases can be implemented as a predictive biomarker for immunotherapy. It highlights that patients with low eTILs are at higher risk of progression under immune checkpoint inhibition and that other options should be sought in these patients, i.e., BRAF/MEK inhibitors for BRAF-mutated melanoma, T cell therapies, and targeted therapy according to genomic alterations for BRAF wild-type melanoma.
tumour thickness and ulceration. [13][14][15] However, the relapse-free survival rate at 5 years of patients with a negative sentinel lymph node biopsy (SLNB) in Europe ranges from 76% to 90%, meaning that up to 25% of these patients relapse within 60 months despite the early stage of initial diagnosis. 16,17 Additional and easy-to-determine biomarkers are needed to identify high-risk patients at early stages so that the follow-up strategies, such as close monitoring or adjuvant therapy, can be adapted. Accurate prognostic and predictive biomarkers could avoid overtreatment of patients with immune checkpoint inhibitors and prevent potential toxicity caused by these therapies.
Predictive biomarkers of response and survival are needed in clinical practice for advanced melanoma patients treated with anti-PD-1-based immunotherapy, as 40-60% of these patients do not benefit. 18 Tumour mutation burden (TMB), 19,20 circulating tumour DNA (ctDNA), 21 specific genomic alterations, 22 or gene expression scores are costly markers and are not available in every centre. Blood markers, such as lactate dehydrogenase (LDH) and neutrophil-to-lymphocyte ratio (NLR), are associated with increased tumour burden and elevated inflammation levels, respectively. They are generally linked to poor prognosis but have only a modest predictive value. Therefore, more reliable predictive biomarkers are needed to complement LDH, routinely assessed for staging. Tumeh et al. showed that responders to immunotherapy had higher numbers of CD8+, PD-1+, and PD-L1+ cells in treatment-naïve melanoma metastases than non-responders, raising the question of whether the quantification of TILs can be included in the staging. 23 BRAF-mutated (BRAF V600E/K ) melanomas account for 40% of cutaneous melanomas, and the presence of the BRAF V600E/K is specifically predictive of response to the approved BRAF inhibitors. Immunotherapy, including anti-PD-1 antibodies, is effective in BRAF V600E/K and BRAF-wild type (BRAF wt ) melanoma. LDH is the only widely used predictive factor for both immune checkpoint inhibition and targeted therapy and is associated with poor prognosis for both therapies. More specific biomarkers are needed to predict resistance and response to immunotherapy, indicating which patients should receive immunotherapy and which should receive another treatment option.
In oncology, deep learning algorithms have recently revolutionised the field of biomarkers, leading to process standardisation. In many cancers, including colon, 24 breast, 25 and testicular cancer, 26 the quantification of TILs was found to be associated with prognosis. In primary melanoma, earlier publications have evaluated the deep learning algorithms NN192 and ADTA to assess TILs and demonstrated their prognostic significance independent of tumour thickness and ulceration. ADTA is based on the identification of patches, while NN192 uses granular analysis. [27][28][29] In metastatic melanoma, limited studies have addressed the prognostic and predictive significance of TILs quantification in hematoxylin-eosin-stained (H&E) sections. 10,30,31 In our study, we investigated the prognostic significance of eTILs assessed in primary cutaneous melanoma tissue in terms of relapse-free survival (RFS), distant metastasis-free survival (DMFS), and overall survival (OS). Moreover, we assessed the association of TILs in metastatic tissue and survival outcomes, including progression-free survival (PFS) and melanoma-specific survival (MSS) after first-line anti-PD-1-based therapy and targeted therapy with BRAF and MEK inhibitors.

Study design
We included patients diagnosed with stage IB to IV cutaneous melanoma between 2010 and 2018. Patients were treated at the academic skin cancer centres in Tuebingen (Germany), Dresden (Germany), and St. Gallen (Switzerland). Other inclusion criteria were: (1) age ≥18 years, (2) availability of a H&E-stained slide of the tumour, (3) good quality of the H&E slide and (4) tumour percentage above 5%. Additional exclusion criteria were: (1) the presence of features not accurately detected by the algorithm, particularly necrosis and pigment incontinence, (2) a tumour thickness of less than 1 mm, as the algorithm did not accurately identify eTILs in these samples and these patients are known to have a low risk of relapse, and (3) brain metastases from patients who had received steroid treatment for the management of their brain metastasis prior to the acquisition of the sample (Supplementary Fig. S1a). The quality of the H&E slide was assessed by a dermatopathologist, with the aim of including samples that were appropriately stained with hematoxylin and eosin, appropriately preserved/fixed and free of any artefacts or debris (blebs, folds, and bubbles) that may obscure or distort the tissue or cell morphology.
We assessed the primary tissue of stage I/II patients and the treatment-naïve metastatic sample of stage III/IV melanoma patients. A total of 111 (34.6%) patients diagnosed with stage I/II had a relapse (progression to stage III/IV) during the follow-up period (105 within 60 months). We also evaluated 89 H&E slides of the metastases excised between 2010 and 2019 that were available from these patients (Supplementary Fig. S1a and b).

Ethics
The study was approved by the ethics committee at the medical faculty of the Eberhard-Karls-University Tuebingen (approval number 883/2019BO2), the Technical University of Dresden (EK 48022018), and by the ethics committee for Eastern Switzerland (EKOS 16/079) and was conducted following the Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines. 32,33 Individual consent was obtained for the use of patients' clinical data.

Clinical data collection
The staging was performed according to the 8th Edition of the American Joint Committee on Cancer (AJCC) Staging Manual. 13 A blinded investigator assessed the response to anti-PD-1-based immunotherapy using the Response Evaluation Criteria in Solid Tumors version 1.1 (RECIST 1.1). 34 The Best Overall Response (BOR) was documented.

Determination of the electronic TIL score (eTILs)
Based on our workflow, histopathological diagnosis of cutaneous melanoma was confirmed by a certified dermatopathologist using a H&E-stained slide ( Supplementary Fig. S2) and slides with staining artefacts were excluded.
H&E-stained slides were digitised at a 40x magnification using a Hamamatsu Nanozoomer Digital Slide Scanner. The whole slide scans were analysed with the digital pathology software QuPath (version 0.1.2), 25 and regions of interest (ROI) were marked under the supervision of a dermatopathologist. To refine the H&E stain estimates for each digitised slide and account for staining variation, we used the "estimate stain vectors" command in QuPath to set the stain and background vectors. Before running the command, we manually selected a small representative region containing clear examples of the staining and a representative background area. We employed watershed cell detection to segment the cells in the image using the following settings: detection image: hematoxylin OD; requested pixel size: 0.5 μm; background radius: 8 μm; median filter radius: 0 μm; sigma: 1.5 μm; minimum cell area: 10 μm²; maximum cell area: 400 μm²; threshold: 0.1; and maximum background intensity. The wand tool was used to select the tumour area and perform cell detection and segmentation. When this was not feasible in larger melanomas, we selected multiple regions or one large representative region from the invasive front up to the superior margin of the tumour. We performed visual segmentation quality control and added smoothed object features at 25 μm and 50 μm radius.
We used the network classifier NN192, which was trained to distinguish TILs from cancer cells, stromal cells, and other cells on H&E sections in melanoma, as published by Acs et al. 27 Using this algorithm, which is publicly available on the GitHub platform, we calculated the eTIL score as eTILs = [TILs/(TILs + tumour cells)]*100% 27,35 (Supplementary Fig. S3). The evaluated TILs included intratumoral TILs within the predefined area and lymphocytes at the tumour base. Immune infiltrates beyond the invasive front were not calculated. The operator was trained on 30 melanoma samples and was blinded to the clinical data before using the algorithm. 27 In 15 primary samples, the granular classifier for TILs identification had an accuracy of 80%, an F1 score of 82%, and a recall of 79%, while in 15 metastases it had an accuracy of 82%, an F1 score of 79%, and a recall of 72%. Performance scores were assessed at the interface of tumour-infiltrating lymphocytes and tumour cells against visual classification and annotation of the cells in this region. After neural network classification, all cases were visually inspected for potential inaccuracies, such as misclassification of cells due to pigment incontinence or apoptotic cells, and excluded if necessary.
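The score definition and the reported performance measures can be illustrated with a minimal Python sketch. This is not the published NN192 code: the function names, the example cell counts, and the confusion-matrix inputs are hypothetical and serve only to show how the formula eTILs = [TILs/(TILs + tumour cells)]*100% and the 16.6% cut-off would be applied once cells have been classified.

```python
def etils_score(n_tils, n_tumour_cells):
    """Return eTILs in percent: TILs / (TILs + tumour cells) * 100."""
    total = n_tils + n_tumour_cells
    if total == 0:
        raise ValueError("no TILs or tumour cells detected in the region")
    return 100.0 * n_tils / total


def etils_group(score, cutoff=16.6):
    """Dichotomise a score; values equal to the cut-off fall in the low group."""
    return "high" if score > cutoff else "low"


def classifier_metrics(tp, fp, fn, tn):
    """Accuracy, recall and F1 of TIL calls against visual annotation.

    tp/fp/fn/tn are hypothetical per-cell counts with TIL as the positive class.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Hypothetical cell counts for one region of interest.
score = etils_score(n_tils=150, n_tumour_cells=850)
print(score, etils_group(score))
```

Because the denominator contains only TILs and tumour cells, stromal and other cell classes do not enter the score in this sketch, which matches the formula as stated.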

Statistics
Summary statistics were reported. Categorical data were reported with counts (n) and percentages (%), and continuous variables were reported as a median and interquartile range [IQR]. We used the Wilcoxon-Mann-Whitney U test to compare continuous variables between two groups. For comparisons of a variable with more than two groups, the Kruskal-Wallis test was applied with Dunn's test for multiple comparisons. For comparisons between two categorical variables, Pearson's chi-squared test was used.
We calculated relapse-free survival (RFS), distant metastasis-free survival (DMFS), and overall survival (OS) of stage I/II patients as the time between the date of primary melanoma diagnosis and date of the event or the censoring date (last contact with the patient). Death due to all causes was considered an OS event. We dichotomised the stage I/II cohort using the published cut-off for eTILs of 16.6%. 27,29 We assessed both scores for patients for whom both primary and metastatic tissue samples were available. A comparison between eTILs in primary tumours and their matched metastases was made using the nonparametric Wilcoxon matched-pairs signed-rank test and considering the type of metastatic tissue.
For stage III/IV, we assessed whether there is an association between TILs and survival outcomes in patients receiving PD-1 checkpoint inhibition or targeted therapy with BRAF and MEK inhibitors. The endpoints for this analysis were progression-free survival (PFS) and melanoma-specific survival (MSS). We defined PFS as the time between the start of systemic treatment and the date of progression or the last follow-up as the censoring date when no progression occurred. MSS was the time between the therapy start and the date of death or the last follow-up as the censoring date. The optimal cut-off that divided patients into two groups in terms of PFS was determined using the R package "Evaluate Cutpoints". 36 Censored data were analysed using the Kaplan-Meier method. A comparison of survival between the group with high eTILs (>16.6%) and low eTILs (≤16.6%) was made using the two-sided log-rank test. Median time-to-event and landmark rates were calculated using the Kaplan-Meier estimator, and hazard ratios (HR) were determined using the Cox Proportional Hazards Model. We used the likelihood ratio test to compare different nested models (stage versus stage plus eTILs group), and the Harrell concordance index (C-index, ranging from 0.5 for no concordance to 1.0 for perfect concordance of the model) was applied to compare the discriminatory ability of the models. The likelihood ratio test compares two nested models, in which the complex model differs from the simpler model only by the addition of some variables, and assesses whether this addition significantly improves the model fit. Patient subgroups were defined according to their clinicopathological characteristics, and subgroup analysis was performed without correction for multiple comparisons.
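To make the Kaplan-Meier estimator and the landmark rates mentioned above concrete, a minimal pure-Python sketch follows. It is an illustration under stated assumptions, not the analysis code of this study: the function names and the follow-up times are hypothetical, and at tied times events are processed before censored observations, which is the usual convention.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up in months (event or censoring time)
    events : 1 if the event (e.g. relapse) occurred, 0 if censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_leaving = Counter(times)  # events plus censored at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk  # product-limit step
            curve.append((t, surv))
        at_risk -= n_leaving[t]
    return curve


def survival_at(curve, landmark):
    """Landmark rate: estimated survival probability at a given time."""
    s = 1.0
    for t, value in curve:
        if t > landmark:
            break
        s = value
    return s


# Hypothetical follow-up data (months) for six patients.
times = [6, 12, 12, 30, 60, 60]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
print(survival_at(curve, 60))
```

A 5-year rate in this sketch is simply the estimated curve read at 60 months; confidence intervals, as reported in the results, would require an additional variance estimate that is omitted here.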
We evaluated the effect of eTILs on staging by plotting Kaplan-Meier curves for patients stratified by both factors.
We conducted an evaluation of the linear relationship between the log-hazard (β coefficient) and the continuous eTILs. To test log-linearity, we plotted the continuous variable (eTILs) against the martingale residuals of a null Cox proportional hazard model. As this assumption was met, we proceeded to investigate the association between continuous eTILs and RFS and/or DMFS using the Cox proportional hazards model. To determine whether the linear model was appropriate, we used the likelihood ratio test to compare the model with linear eTILs to another model with restricted cubic splines of eTILs.
The concordance index (C-index) and likelihood ratio test were used to compare the model that included eTILs score and staging with the model based on staging alone. Furthermore, we plotted 5-year RFS against continuous eTILs for the different stages. Using the "regplot" package in R, we established a nomogram that incorporates eTILs score and staging to predict metastasis based on the results of the multiple Cox model.
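The two model-comparison tools named above can be sketched in a few lines of Python. This is a simplified illustration rather than the R code used for the analysis: the function names are hypothetical, pairs with tied follow-up times are ignored in the C-index, and the likelihood ratio test assumes that the larger model adds exactly one parameter (for example, a dichotomised or linear eTILs term), so that the statistic is referred to a chi-squared distribution with one degree of freedom.

```python
import math


def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the patient with the shorter time had an event.
    Higher risk_scores should correspond to earlier events.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def likelihood_ratio_test_1df(loglik_simple, loglik_complex):
    """LRT for nested models differing by one parameter (assumption)."""
    stat = 2.0 * (loglik_complex - loglik_simple)
    # Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for "stage" versus "stage + eTILs".
stat, p = likelihood_ratio_test_1df(-100.0, -97.0)
print(stat, p)
```

The C-index measures discrimination only, whereas the likelihood ratio test measures the improvement in fit, which is why the two are reported side by side.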

Role of the funding source
The funding sources were not involved in the study design, analysis, data interpretation, writing and submission of the manuscript.
There was a statistically significant difference in RFS between the group with high (>16.6%) and low eTILs (≤16.6%) (p = 0.001, log-rank). The median RFS of the group with low eTILs was 57 months (95% CI, 35-NR), while it was not reached in the group with high eTILs. The 5-year RFS rate was 48.5% (95% CI, 37.1-59) in the low eTILs group, lower than the 69.5% (95% CI, 62.5-75.5) in the high eTILs group (Fig. 1a and b).
With a hazard ratio of 2.6 (95% CI, 1.49-4.53; p = 0.0007), the risk of developing distant metastasis was more than double in the low eTILs group.
In a subgroup analysis for 5-year RFS, the low eTILs group was associated with a significantly less favourable outcome for females, patients older than 65 years, patients with melanomas of the head, neck, and trunk, with SSM or BRAF wt melanomas, with tumours thicker than 4 mm, or with ulcerated and non-ulcerated melanomas. Furthermore, in stage II patients, those with low eTILs had significantly worse RFS than those with high eTILs (Fig. 1e).
In a subgroup analysis of the 5-year DMFS, the low eTILs group performed significantly worse for both sexes, patients older than 65 years, patients with melanomas of the head, neck, and trunk, with SSM or BRAF wt melanomas, with tumours thicker than 4 mm, or with ulcerated and non-ulcerated melanomas. Furthermore, in stage II patients, those with low eTILs had significantly worse DMFS than those with high eTILs (Fig. 1f).
To determine if eTILs adds discriminatory information to staging with respect to 5-year RFS, we calculated the C-index of the two corresponding models. The model estimating melanoma substages only (IB-IIC) had a C-index of 0.654 for RFS and 0.653 for DMFS, while the model including eTILs improved to 0.669 for RFS and 0.683 for DMFS, suggesting that the inclusion of the eTILs provided a better discriminative ability for both RFS and DMFS. We then compared the two nested models using the log-likelihood ratio test. The log-likelihood value of the model with the addition of eTILs was higher than the log-likelihood value with only the stage. This difference was statistically significant
(p = 0.0173 for RFS, p = 0.0077 for DMFS), indicating that adding eTILs as a variable improves the goodness of fit and results in a more accurate model. We used a Kaplan-Meier plot for RFS to determine whether categorical eTILs up- or downgrade the current staging. The staging in our cohort showed 5-year RFS rates in accordance with the published ones (Fig. 2a). When we added the eTILs with the stated cut-off, low eTILs downgraded the staging (Fig. 2b).
Since, from a biological perspective, the more TILs, the better the prognosis, we also evaluated the prognostic significance of eTILs as a continuous variable. In the simple Cox regression, higher eTILs scores were associated with improved RFS (HR = 0.97; 95% CI, 0.96-0.99; p = 0.0015) and DMFS (HR = 0.97; 95% CI, 0.94-0.99; p = 0.0056). Furthermore, a near-linear relationship between 5-year RFS and continuous eTILs was found when we plotted both variables. This illustrates that higher eTILs scores correlate with a better prognosis in terms of 5-year RFS (Fig. 2c). The same relationship was observed when the data were divided by substages (Fig. 2d).
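As a worked illustration of how such a per-unit hazard ratio scales, the short Python sketch below converts it to a larger difference in eTILs. It assumes that the reported HR refers to a one-percentage-point increase in eTILs and that the log-hazard is linear in eTILs; the function name and the chosen 10-point difference are hypothetical, and the reported HR is rounded, so the result is approximate.

```python
def scale_hazard_ratio(hr_per_unit, delta_units):
    """HR for a difference of delta_units, assuming a log-linear effect."""
    return hr_per_unit ** delta_units


# Reported simple Cox regression estimate for RFS (assumed per 1 point).
hr_rfs = 0.97
delta = 10  # hypothetical difference in eTILs, in percentage points

print(round(scale_hazard_ratio(hr_rfs, delta), 2))
```

Under these assumptions, a 10-point higher eTILs score corresponds to a hazard ratio of about 0.74 for relapse, which gives a more intuitive sense of the effect size than the per-point estimate.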
eTILs as a continuous variable remained significant (HR = 0.98, 95% CI, 0.97-0.99, p = 0.019) in multiple Cox regression for 5-year RFS, including tumour thickness (p = 0.0003) and ulceration (ref: no ulceration; HR = 1.45, 95% CI, 0.95-2.22, p = 0.0841) (Table 3). Both the staging (p < 0.001) and the eTILs (HR = 0.98, 95% CI, 0.96-0.99, p = 0.02) were significant in the multiple Cox regression for 5-year RFS. The C-index of the staging with eTILs as a continuous variable was 0.676 for RFS and 0.666 for DMFS. The likelihood ratio test revealed a significant superiority of the model that included eTILs (p = 0.0168 for RFS and 0.039 for DMFS). Continuous eTILs and staging were integrated into a nomogram to easily predict the relapse probability of early-stage melanoma patients (Fig. 2e).
Table S2). A total of 30 (15.7%) metastatic patients did not receive any of these systemic treatments, as adjuvant therapy for R0 (microscopically margin-negative resection) patients had not yet been approved, patients received further treatment outside the clinic, received alternative therapies because of very rapid progression, or patients decided against systemic therapy. Metastasectomy or biopsy was conducted either as therapy (to obtain an R0 status) or for diagnostic purposes. eTILs of treatment-naïve metastases was not significantly different between patients younger and older than 65 years (p = 0.8084, Mann-Whitney U), male and female patients (p = 0.6157, Mann-Whitney U), BRAF V600E/K and BRAF wt metastases (p = 0.2516, Mann-Whitney U), patients with presence or absence of brain metastasis at therapy start (p = 0.330, Mann-Whitney U) and patients with elevated or normal LDH at therapy start (p = 0.0913, Mann-Whitney U). However, there was a trend for patients with higher eTILs to have lower LDH at therapy start (median 15.02 vs 11.52). No significant differences were found. However, the median eTILs was highest in metastases of the lung (median: 15 Fig. S7).

Survival analysis
We applied the Evaluate Cutpoints package in R 36 using progression-free survival (PFS) as an endpoint to determine the optimal threshold for eTILs that separated the cutaneous advanced unresectable melanoma patients that received anti-PD-1-based immunotherapy (n = 101) into two prognostically different groups. We obtained 12.2% as the optimum cutpoint for the eTILs, and those with ≤12.2% (low eTILs) had a significantly shorter PFS (p = 0.037, log-rank) and melanoma-specific survival (MSS) (p = 0.0038, log-rank) than those with eTILs >12.2% (high eTILs). The median PFS in the low eTILs group was only 6 months (95% CI, 3-13), while it was 22 months (95% CI, 7-NA) in the high eTILs group (Fig. 4c) 3) in the low eTILs group (Fig. 4d).
In patients receiving targeted therapy (n = 32), there was no significant difference in prognosis according to the eTIL score (PFS: p = 0.31; MSS: p = 0.44, log-rank). In patients receiving adjuvant anti-PD-1-based therapy (n = 21), there was a trend toward shorter PFS for the group of patients with low eTILs (≤12.2%), but neither PFS (p = 0.24, log-rank) nor MSS (p = 0.72, log-rank) reached significance, possibly owing to the small sample size (Supplementary Fig. S8a-d).

Discussion
Our large-scale study first assessed the prognostic relevance of eTILs in patients diagnosed at early stages. Primary tissue of 321 early-stage melanoma patients with a 5-10-year follow-up period was assessed, and therefore robust RFS, DMFS, and OS data were analysed. RFS and DMFS were considered optimal endpoints. The primaries of stage IB to IIC were excised before 2018, before the approval of early-stage adjuvant therapies, and therefore RFS and DMFS were not influenced by potential systemic therapy. We demonstrated the prognostic relevance of eTILs as a continuous variable in primary early-stage (IB-IIC) cutaneous melanoma, in addition to tumour thickness and ulceration as the main determinants of disease stage. We validated the published eTILs cut-off of 16.6% 27 by finding that the two groups with high and low eTILs had significantly different RFS, DMFS, and OS, with the low eTILs group having the worse prognosis. The eTILs remained significant in the multiple Cox regression analysis adjusted for tumour thickness and ulceration, variables currently included in the AJCC staging (8th edition). This result is consistent with previous publications associating eTILs with disease-specific and relapse-free survival in melanoma patients. 27,29,37 The conflicting results of previous publications regarding the prognostic value of TILs in primary melanoma assessed by histopathologists were most likely due to a lack of standardisation of the quantification method. 12 With the introduction of NN192, there was the first evidence for the general prognostic significance of TILs quantified with the help of machine learning in cutaneous melanoma. 37 These results render eTILs calculated with the deep learning algorithm NN192 a robust prognostic biomarker. eTILs could complement AJCC staging for stages IB through IIC to assess the risk of relapse and metastasis, as indicated in our analysis by the increase in the C-index after adding eTILs to the staging and by the log-likelihood ratio test showing that the two nested models were significantly different. As adjuvant therapies are approved for earlier stages, there is a need for a reliable and easy-to-perform risk assessment for RFS and DMFS. Consequently, patients having low eTILs are at higher risk of relapse and distant metastasis and could be offered closer follow-up or appropriate adjuvant therapies. 27,28 The prognostic value was independent of the BRAF mutation status, as shown by the subgroup analysis of patients with either BRAF wild type or BRAF mutated melanoma.
Fig. 1: Kaplan-Meier curves for RFS and the number of melanoma patients at risk at specific time points starting from primary melanoma diagnosis. (c) Distant metastasis-free survival analysis for the 321 stage IB to IIC melanoma patients according to the eTILs group using the cut-off of 16.6%, with Kaplan-Meier curves for DMFS and the number of patients at risk from the date of primary melanoma diagnosis. (d) Overall survival analysis for the 321 stage IB to IIC patients according to the eTILs group using the cut-off of 16.6%, with Kaplan-Meier curves for OS and the number of patients at risk from primary melanoma diagnosis. (e) Simple subgroup analysis for 5-year RFS for the low eTILs group ≤16.6% (reference: high group), showing the unadjusted HR for RFS of the low eTILs group in the different patient subgroups. (f) Simple subgroup analysis for 5-year DMFS for the low eTILs group ≤16.6% (reference: high group), showing the unadjusted HR for DMFS of the low eTILs group in the different patient subgroups.
Abbreviations: HR, Hazard ratio; CI, Confidence interval; SSM, Superficial spreading melanoma; NM, Nodular melanoma; LMM, Lentigo malignant melanoma; ALM, Acral lentiginous melanoma; y, Years; extr., Extremities; Dx, Diagnosis; NA, Not applicable; HR was not calculated if fewer than ten events occurred in this patient subgroup; *Patients for whom information was unknown were not shown.
Our observed reduction in TILs from primary melanoma to matched metastasis supports the immunoediting hypothesis that immune escape and antigen modification during the multistep process of metastasis lead to reduced infiltration of tumour tissue by immune cells that suppress tumour growth. This observation is consistent with findings in breast cancer, showing that the tumour microenvironment influences progression and immune evasion and therefore is crucial for the development of metastases. [38][39][40] We showed that the in-transit and distant metastases had significantly lower eTILs than the matched primary melanomas, although this decrease was not significant for lymph node metastases. One explanation for this would be that the algorithm cannot discriminate between normal lymphocytes in lymph nodes and TILs.
As a further point, we assessed the predictive relevance of eTILs in advanced stages, where systemic therapy is needed. By analysing 101 treatment-naïve metastatic samples of patients receiving non-adjuvant first-line anti-PD-1-based ICI after entering stage III/IV, we demonstrated that patients with high eTILs in their metastasis had better PFS and MSS than those with lower scores. Most metastases were skin metastases, either satellites/in-transit or distant skin metastases. We focused on patients treated with first-line immunotherapy, as previous therapies could alter the tumour microenvironment, leading to increased infiltration of the tumour by cells with immunosuppressive capabilities. Tumeh et al. already showed that responders to immunotherapy had more CD8+, PD-1+, and PD-L1+ cells in their samples than non-responders. 23 Our data extend this result to survival outcomes and propose using the eTILs in the clinic, which measures lymphocytes, including CD8+ cells and other subtypes, 37 [...] in other cancers, showing that machine learning-based calculations of TILs were associated with response to immunotherapy. 41 eTILs ≤12.2% and LDH remained significant in multiple Cox regression, making the combination a good predictor of survival outcomes after immune checkpoint inhibition. The eTILs appears to be a more specific biomarker for immune checkpoint inhibition than LDH, which could help identify patients who should or should not receive ICI. For targeted therapies with BRAF/MEK inhibitors, there was no statistically significant difference between low (≤12.2%) and high eTILs (>12.2%). However, we cannot draw a general conclusion due to the small sample size.
A recent publication showed that TILs were predictive of cutaneous immune-related adverse events in advanced melanoma patients treated with PD-1 inhibitors. 42 Considering that the presence of cutaneous toxicities has been associated with better response outcomes, this reinforces our findings. 43 Moreover, a phase I trial showed that assessing CD8+ TILs in cancer lesions using PET-CT is safe and can be predictive of response outcomes. 44 CD8+ cells, M0, and M2 macrophages represent the most frequent immune cell populations in melanoma. 45 Therefore, the eTILs calculated by NN192 are mainly determined by CD8+ cells and macrophages. 45 Antoranz et al. showed that the spatial distribution of cytotoxic T cells and PD-L1+ macrophages predicted response to anti-PD-1 immunotherapy in melanoma. 46 Using in-depth single-cell RNA data, Tirosh et al. revealed the variability in the activation states and the clonal expansion of T cells across melanoma patients. 47 Therefore, the location and functional characteristics of TILs within the melanoma microenvironment can have a different prognostic and predictive impact. Immune cells have a different impact on tumour progression according to their exact subtype and their activation status, that is, whether they are exhausted, effector, or effector memory cells (for example, Th1, Th2, Th9, Th17, Th22, or Treg for CD4+ T cells; M0, M1, M2 for macrophages; Tc1, Tc2, Tc9, Tc17, Tc22 for CD8+ T cells; cDC1, cDC2, pDC, moDC, iDC, Langerhans cells for dendritic cells; or N1 and N2 for neutrophils). Through cytokine secretion, the CD8+ and CD4+ Th1 T cells are the primary effector cells associated with a good prognosis, while other CD4+ T cell subsets (Th2, Th17), myeloid suppressor cells, M2 macrophages, and Treg cells cause tumour promotion. 48 Also, high pre-treatment clonality of the TILs has been associated with better OS of melanoma patients treated with anti-PD-1 therapies. 49
As a consequence of our results, patients with high eTILs and a relevant risk of relapse (IIB and above) or advanced melanoma patients (stage III/IV) should receive anti-PD-1 therapy for both BRAF wild-type and BRAF-mutated melanoma. This concept is also supported by the DREAMseq trial demonstrating that patients with BRAF-mutated melanoma receiving first-line ICI and then targeted therapy had a better prognosis than those receiving initially targeted therapy and, subsequently, ICI. However, TILs or CD8+ T cells were not assessed in this trial. 50 Klein et al. also showed that clusters of TILs in BRAF V600E/K melanoma were more likely to be associated with a better prognosis and response to immunotherapy than in NRAS-mutated or BRAF/NRAS wild-type melanoma. 31 Concerning BRAF-mutated melanoma with low TILs, the optimal therapeutic strategy should be sought, as BRAF inhibition can have immunomodulatory effects by increasing the number of TILs or reprogramming them. 51 Ascierto and colleagues demonstrated that patients (stage IIC to IIIC) with BRAF V600E/K melanoma and low TILs had a greater benefit in terms of disease-free survival after adjuvant BRAF inhibition (vemurafenib) compared to those with high TILs. 52 Therefore, targeted therapy with BRAF inhibitors may be an option for the group with low eTILs either as a combination or as a sequential therapy. Clinical trials evaluating TILs should address this topic in the future.
Since patients with low eTILs seem to benefit less from anti-PD-1-based therapy. Novel therapies such as [...]
How clinicians can apply NN192 or other deep-learning algorithms to cutaneous melanoma patients remains to be addressed. The guideline-compliant treatment for stage I/II melanoma is the complete surgical removal of the primary melanoma and adjuvant immunotherapy for stages IIB/IIC. 54 Our results show that low eTILs downgrades staging throughout the early stages. Therefore, patients in stages IB to IIA with high TILs should be followed up according to guidelines. In contrast, patients with low eTILs should be monitored more closely or even considered candidates for adjuvant therapy from our point of view. We propose a model of how TILs can be applied in the clinic regarding follow-up and therapeutic strategies (Supplementary Fig. S9). Clinical trials are needed to investigate this.
In several other cancers, including breast, lung, and colorectal cancer, TILs are accepted as prognostic and predictive biomarkers. In breast cancer, TILs are considered a marker of tumour immunogenicity. Higher levels of stromal TILs were strongly associated with better prognosis in early-stage triple-negative breast cancer and HER2-positive breast cancer and better clinical outcomes in patients treated with adjuvant or neoadjuvant chemotherapy or PD-1 inhibitors. The International Immuno-Oncology Biomarker Working Group, also known as the International TILs Working Group, has recently developed guidelines for evaluating TILs in breast cancer, proposing a standardised scoring system that considers the percentage and distribution of TILs, and has recommended its use in clinical trials. 5,45,48,[55][56][57] Using this scoring system, TILs were shown to be predictive for therapy with the anti-PD-1 antibody pembrolizumab but not for chemotherapy using anthracyclines and/or taxanes in a clinical trial for metastatic triple-negative breast cancer. 58 CD8-positive TILs are also associated with better prognosis and response to therapy in non-small cell lung cancer. 41,59,60 Similarly, TILs were found to be beneficial in colorectal cancer, leading to the well-known Immunoscore, which measures tumour CD3+ and CD8+ lymphocyte density and has been incorporated into the current staging system (TNM-I). 24 TILs are also associated with prognosis in several other types of cancer, including head and neck, gastric and ovarian cancer. [61][62][63]
Along with the advantages of the present study, some limitations remain. First, it is a retrospective study. The scanner used to digitise the H&E stains differed from the one used to develop the algorithm. Although it is an automatic algorithm, the process can only be considered semi-automatic because it is operator-dependent, and intensive operator training was essential for the accurate quantification of TILs and melanoma cells.
A misclassification of cells was initially avoided by calculating the performance metrics and obligatory visual inspection, but the possibility of an error still remains for the granular classifier. Immune cells beyond the invasive tumour margin, immune cell subtypes, phenotypes, spatial localisation, or the functional status of TILs were not analysed, and we did not evaluate tertiary lymphoid structures or germinal centres. 29 Most metastases assessed were distant skin or in-transit and satellite metastases. A few high-score patients developed metastasis in the stage I/II cohort, which may be attributed to lymphocytes with tumour-suppressive capacities or cells misclassified as TILs. A more detailed characterisation of these samples in the future will provide information on how to improve the accuracy of the eTILs or whether additional staining is required for certain markers.
Although we observed a difference in progression-free survival (PFS) using the eTILs cut-off in metastatic samples at baseline, several patients with high scores exhibited disease progression, highlighting the need for a combination of biomarkers that can better discriminate these patients. [18][19][20][64][65][66][67] Such additional biomarkers could be tumour mutational burden, specific mutations, microsatellite instability, microenvironmental factors and many others.
Further research is needed, and challenges exist before implementing such deep-learning scores in the clinic. 68 The inclusion of eTILs or similar TIL scores in the pathology report, which could even be generated by manual counting if slide scanners are not available, is the first step that makes extensive multicentre studies possible. Such studies should assess the variability among operators, slide scanners, and centres due to different pathology processing methods. However, standardisation is essential. The International TILs Working Group has already developed guidelines for scoring TILs in breast cancer, which are widely used in clinical and research settings. Guidelines for H&E-based TILs assessment in melanoma, similar to those developed for breast cancer, should be developed. They should include recommendations for the staining protocol, the definition of the region of interest and the quantification method. It should also be addressed how to simplify the deep learning classifiers in a user-friendly manner and how to standardise their use. 28 While machine learning algorithms can increase the efficiency and accuracy of TIL classification and quantification, a trained pathologist must review and verify the results of these algorithms, and a training phase is essential prior to implementation to adapt the classifier to the centre's specifications to achieve optimal accuracy in TIL enumeration. It is important to acknowledge that deep learning classifiers are not a substitute for human expertise. Therefore, pathologists should be trained appropriately. They should learn how to fine-tune the algorithms, evaluate their performance and be aware of potential limitations and pitfalls.
Overall, the enumeration of TILs as eTILs is an affordable and technically feasible prognostic and predictive biomarker based on simple diagnostic H&E staining and a publicly available algorithm without the need for costly methods, including sequencing or immunohistochemistry (Fig. 5).
The present study confirms the prognostic significance of TILs in early-stage primary cutaneous melanoma and shows that they could complement the current AJCC classification for determining the risk of relapse. Our results also show that eTILs in therapy-naïve cutaneous melanoma metastases are a potential predictive biomarker for response and survival outcomes to anti-PD-1-based immunotherapy.
Data sharing statement
All scanner-based imaging data is centrally managed through the University-wide data management infrastructure at the Quantitative Biology Center (QBiC) using the Open Microscopy Environment (OME) Remote Objects (OMERO) infrastructure. 69 The imaging metadata is standardised using the bio-formats framework. 70 All data is accessible via qbic.life and can be made available upon reasonable request and with permission of the corresponding author after an appropriate data access agreement specifying the terms and conditions of use of the data. The other authors report no potential conflicts of interest.